Interpretable & Explorable Approximations of Black Box Models

Authors

  • Himabindu Lakkaraju
  • Ece Kamar
  • Rich Caruana
  • Jure Leskovec
Abstract

We propose Black Box Explanations through Transparent Approximations (BETA), a novel model-agnostic framework for explaining the behavior of any black-box classifier by simultaneously optimizing for fidelity to the original model and interpretability of the explanation. To this end, we develop a novel objective function that allows us to learn, with optimality guarantees, a small number of compact decision sets, each of which explains the behavior of the black-box model in unambiguous, well-defined regions of feature space. Our framework can also accept user input when generating these approximations, allowing users to interactively explore how the black-box model behaves in different subspaces of interest. To the best of our knowledge, this is the first approach that can produce global explanations of the behavior of any given black-box model through joint optimization of unambiguity, fidelity, and interpretability, while also allowing users to explore model behavior based on their preferences. Experimental evaluation with real-world datasets and user studies demonstrates that our approach can generate highly compact, easy-to-understand, yet accurate approximations of various kinds of predictive models compared to state-of-the-art baselines.
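
To make the quantities named in the abstract concrete, here is a minimal Python sketch of how a rule-based approximation might be scored against a black box. It is not the authors' implementation and it does no optimization: the rule layout (a subspace descriptor plus an inner condition), the stand-in `black_box` function, the toy records, and the three measures are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Predicate = Callable[[Record], bool]
# Hypothetical rule layout: (subspace descriptor, inner condition, label).
Rule = Tuple[Predicate, Predicate, int]


def black_box(x: Record) -> int:
    """Stand-in for an opaque classifier (assumption, not a real model)."""
    return int(0.5 * x["age"] + x["bmi"] > 60)


def covers(rule: Rule, x: Record) -> bool:
    """A rule covers a record when both of its conditions hold."""
    descriptor, condition, _ = rule
    return descriptor(x) and condition(x)


def predict(rules: List[Rule], default: int, x: Record) -> int:
    """Label from the first covering rule, else the default label."""
    for rule in rules:
        if covers(rule, x):
            return rule[2]
    return default


def fidelity(rules: List[Rule], default: int, data: List[Record]) -> float:
    """Fraction of records where the approximation agrees with the black box."""
    agree = sum(predict(rules, default, x) == black_box(x) for x in data)
    return agree / len(data)


def overlap(rules: List[Rule], data: List[Record]) -> int:
    """Records covered by more than one rule (0 means unambiguous)."""
    return sum(sum(covers(r, x) for r in rules) > 1 for x in data)


# Toy approximation: two rules, one per age subspace, default label 0.
RULES: List[Rule] = [
    (lambda x: x["age"] > 50, lambda x: x["bmi"] > 30, 1),
    (lambda x: x["age"] <= 50, lambda x: x["bmi"] > 35, 1),
]

DATA: List[Record] = [
    {"age": 60, "bmi": 32}, {"age": 60, "bmi": 25},
    {"age": 40, "bmi": 38}, {"age": 30, "bmi": 28},
    {"age": 70, "bmi": 31}, {"age": 45, "bmi": 45},
    {"age": 55, "bmi": 29}, {"age": 20, "bmi": 36},
]

FIDELITY = fidelity(RULES, 0, DATA)   # agreement with the black box
OVERLAP = overlap(RULES, DATA)        # ambiguity proxy
SIZE = len(RULES)                     # interpretability proxy: rule count

print(f"fidelity={FIDELITY:.2f} overlap={OVERLAP} rules={SIZE}")
```

On this toy data the two rules disagree with the stand-in black box on two of the eight records, so the fidelity is 0.75, and because the age descriptors are mutually exclusive no record is covered by more than one rule. A framework of the kind described above would search over candidate rule sets to trade these quantities off jointly rather than score a single hand-written set.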

Similar resources

Hybrid Decision Making: When Interpretable Models Collaborate With Black-Box Models

Interpretable machine learning models have received increasing interest in recent years, especially in domains where humans are involved in the decision-making process. However, some loss of task performance in exchange for interpretability is often inevitable. This performance downgrade puts practitioners in a dilemma of choosing between a top-performing black-box model with no explanati...

Programs as Black-Box Explanations

With the increasing complexity of machine learning systems being used, there is a crucial need for providing insights into what these models are doing. Model-agnostic approaches [18], such as Baehrens et al. [1] and Ribeiro et al. [17], have shown that insights into complex, black-box models do not have to come at a cost of accuracy, and that accurate local explanations can successfully be provide...

Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery

Recurrent neural networks (RNNs) are powerful and effective for processing sequential data. However, RNNs are usually considered “black box” models whose internal structure and learned parameters are not interpretable. In this paper, we propose an interpretable RNN based on the sequential iterative soft-thresholding algorithm (SISTA) for solving the sequential sparse recovery problem, which mod...

Identification of Numerically Accurate First-Order Takagi-Sugeno Systems with Interpretable Local Models from Data

The paper deals with the interpretability problem of first-order Takagi-Sugeno systems, and interpolation issues in particular. Interpolation improvement is carried out by a corrective secondary model (essentially a black box) complementing the primary (interpretable) model. An optimization technique for this two-model configuration is developed. Experimental results suggest that this approach achieves...

Selection Of Physically Interpretable Data Driven Model Structures To Analyze Industrial Processes

This work presents a black-box input selection approach to reveal causal dependencies between process variables of complex industrial systems. This allows data-based modeling with a physically interpretable model structure. For this purpose a method is used which combines statistical and analytical approaches to find causal relations between measured data, detection of control loops and the inter...

Journal:
  • CoRR

Volume: abs/1707.01154

Publication year: 2017